    Applications and accuracy of the parallel diagonal dominant algorithm

    The Parallel Diagonal Dominant (PDD) algorithm is a highly efficient, ideally scalable tridiagonal solver. This paper gives a detailed study of the PDD algorithm. First, the PDD algorithm is introduced. The algorithm is then extended to solve periodic tridiagonal systems, and a variant, the reduced PDD algorithm, is proposed. An accuracy analysis is provided for one class of tridiagonal systems: symmetric and anti-symmetric Toeplitz tridiagonal systems. Implementation results show that the analysis gives a good bound on the relative error and that the algorithm is a good candidate for emerging massively parallel machines.

    A simple parallel prefix algorithm for compact finite-difference schemes

    A compact scheme is a discretization scheme that is advantageous for obtaining highly accurate solutions. However, the systems resulting from compact schemes are tridiagonal systems that are difficult to solve efficiently on parallel computers. Exploiting their almost symmetric Toeplitz structure, a parallel algorithm, simple parallel prefix (SPP), is proposed. The SPP algorithm requires less memory than conventional LU decomposition and is highly efficient on parallel machines. It consists of a prefix communication pattern and AXPY operations. Both the computation and the communication can be truncated without degrading the accuracy when the system is diagonally dominant. A formal accuracy study was conducted to provide a simple truncation formula. Experimental results, measured on a MasPar MP-1 SIMD machine and on a Cray 2 vector machine, show that the simple parallel prefix algorithm is a good algorithm for the compact scheme on high-performance computers.

    Efficient Parallel Kernel Solvers for Computational Fluid Dynamics Applications

    Distributed-memory parallel computers dominate today's parallel computing arena. These machines, such as Intel Paragon, IBM SP2, and Cray Origin2000, have successfully delivered high-performance computing power for solving some of the so-called "grand-challenge" problems. Despite this initial success, parallel machines have not been widely accepted in production engineering environments because of the complexity of parallel programming. On a parallel computing system, a task has to be partitioned and distributed appropriately among processors to reduce communication cost and to attain load balance. More importantly, even with careful partitioning and mapping, the performance of an algorithm may still be unsatisfactory, since conventional sequential algorithms may be serial in nature and may not be implemented efficiently on parallel machines. In many cases, new algorithms have to be introduced to increase parallel performance. To achieve optimal performance, in addition to partitioning and mapping, a careful performance study should be conducted for a given application to find a good algorithm-machine combination. This process, however, is usually painful and elusive. The goal of this project is to design and develop efficient parallel algorithms for highly accurate Computational Fluid Dynamics (CFD) simulations and other engineering applications. The work plan is 1) to develop highly accurate parallel numerical algorithms, 2) to conduct preliminary testing to verify the effectiveness and potential of these algorithms, and 3) to incorporate the newly developed algorithms into actual simulation packages. The work plan has been successfully carried out. Two highly accurate, efficient Poisson solvers have been developed and tested based on two different approaches: (1) adopting a mathematical geometry that has a better capacity to describe the fluid, and (2) using a compact scheme to gain high-order accuracy in the numerical discretization. The previously developed Parallel Diagonal Dominant (PDD) algorithm and Reduced Parallel Diagonal Dominant (RPDD) algorithm have been carefully studied on different parallel platforms for different applications, and a NASA simulation code developed by Man M. Rai and his colleagues has been parallelized and implemented based on data dependency analysis. These achievements are addressed in detail in the paper.

    Optimal cube-connected cube multiprocessors

    Many CFD (computational fluid dynamics) and other scientific applications can be partitioned into subproblems. In general, however, the partitioned subproblems are very large. They demand high-performance computing power themselves, and the solutions of the subproblems have to be combined at each time step. The cube-connected cube (CCCube) architecture is studied. The CCCube architecture is an extended hypercube structure in which each node is itself a cube. It requires fewer physical links between nodes than the hypercube and provides the same communication support as the hypercube for many applications. The links saved can be used to increase the bandwidth of the remaining links and, therefore, the overall performance. The concept of optimal CCCubes, which are the CCCubes with a minimum number of links for a given total number of nodes, and a method to obtain them are proposed. The superiority of optimal CCCubes over standard hypercubes is also shown in terms of link usage in the embedding of a binomial tree. A useful computation structure based on a semi-binomial tree for divide-and-conquer parallel algorithms is identified. It is shown that this structure can be implemented in optimal CCCubes without performance degradation compared with regular hypercubes. The results presented should provide a useful approach to the design of scientific parallel computers.

    Distributed computing feasibility in a non-dedicated homogeneous distributed system

    The low cost and availability of clusters of workstations have led researchers to re-explore distributed computing using independent workstations. This approach may provide better cost/performance than tightly coupled multiprocessors. In practice, this approach often uses otherwise wasted cycles to run parallel jobs. The feasibility of such a non-dedicated parallel processing environment, assuming workstation processes have preemptive priority over parallel tasks, is addressed. An analytical model is developed to predict parallel job response times. The model provides insight into how significantly workstation owner interference degrades parallel program performance. A new term, task ratio, which relates the parallel task demand to the mean service demand of nonparallel workstation processes, is introduced. It is proposed that task ratio is a useful metric for determining how large the demand of a parallel application must be in order to make efficient use of a non-dedicated distributed system.

    Hawking radiation-quasinormal modes correspondence for large AdS black holes

    It is well known that the non-strictly thermal character of the Hawking radiation spectrum generates a natural correspondence between Hawking radiation and black hole quasinormal modes. This issue has been analyzed in the framework of Schwarzschild black holes, Kerr black holes and nonextremal Reissner-Nordstrom black holes. In this paper, by introducing the effective temperature, we reanalyze the non-strictly thermal character of large AdS black holes. The results show that the effective mass corresponding to the effective temperature is approximately the average one in any dimension, and the other effective quantities can also be obtained. Based on the known forms of the quasinormal mode frequencies, we reanalyze the asymptotic frequencies of the large AdS black hole in three and five dimensions. We then obtain formulas for the Bekenstein-Hawking entropy and the quantization of the horizon area as functions of the quantum "overtone" number n. Comment: 6 pages

    Dynamic response and dangerous point stress analysis of gear transmission system

    Gear transmission is the principal mode of power transmission in many machines, and the reliability of the transmission system has an important influence on the accomplishment of daily tasks. This paper takes a gear transmission system as its research object: we build a two-stage gear transmission system model and calculate its dynamic response theoretically. We then study how the gear mesh stiffness varies with mesh position in the gear transmission system. On the basis of this work, we establish a finite element simulation model of the gear system that considers the tooth contact of the internal gear system. From the simulation, we obtain the contact response and the time history of the equivalent stress in several important areas. Through this work, the contact stress of the two-stage gear system can be studied by both theoretical and finite element simulation methods, which provides guidance for the optimal structural design of two-stage gear transmission systems.

    A parallel two-level hybrid method for tridiagonal systems and its application to fast Poisson solvers

    A layout-aware optimization strategy for collective I/O

    In this study, we propose an optimization strategy to promote better integration of parallel I/O middleware and parallel file systems. We illustrate that a layout-aware optimization strategy can improve the performance of current collective I/O in parallel I/O systems. We present the motivation, prototype design and initial verification of the proposed layout-aware optimization strategy. The analytical and initial experimental results demonstrate that the proposed strategy has the potential to improve parallel I/O system performance.

    catena-Poly[[bis(acetato-κO)aquacopper(II)]-μ-5-(pyridin-3-yl)pyrimidine-κ²N¹:N⁵]

    In the title compound, [Cu(CH3CO2)2(C9H7N3)(H2O)]n, the Cu(II) ion is pentacoordinated in a square-pyramidal geometry. The N atoms of the two chelating symmetry-related 5-(pyridin-3-yl)pyrimidine ligands and the O atoms of the two monodentate acetate anions are nearly coplanar, with a mean deviation from the least-squares plane of 0.157 (2) Å, and the Cu(II) ion is displaced by 0.050 (3) Å from this plane towards the apical water O atom. Bridging through the bis-monodentate 5-(pyridin-3-yl)pyrimidine ligand forms a one-dimensional coordination polymer extending parallel to [010]. In the crystal, O—H⋯O hydrogen bonds link the molecules into a two-dimensional supramolecular structure parallel to (100). The crystal studied was an inversion twin with a 0.57 (3):0.43 (3) domain ratio.